SHMEM_COLLECT(3)                                              SHMEM_COLLECT(3)

NAME
     shmem_collect4, shmem_collect8, shmem_collect32, shmem_collect64,
     shmem_fcollect, shmem_fcollect4, shmem_fcollect8, shmem_fcollect32,
     shmem_fcollect64 - Concatenates blocks of data from multiple processing
     elements (PEs) to an array in every PE

SYNOPSIS
     C or C++:

          #include <mpp/shmem.h>

          void shmem_collect32(void *target, const void *source,
               size_t nlong, int PE_start, int logPE_stride, int PE_size,
               long *pSync);

          void shmem_collect64(void *target, const void *source,
               size_t nlong, int PE_start, int logPE_stride, int PE_size,
               long *pSync);

          void shmem_fcollect32(void *target, const void *source,
               size_t nlong, int PE_start, int logPE_stride, int PE_size,
               long *pSync);

          void shmem_fcollect64(void *target, const void *source,
               size_t nlong, int PE_start, int logPE_stride, int PE_size,
               long *pSync);

     Fortran:

          INCLUDE "mpp/shmem.fh"

          INTEGER nlong
          INTEGER PE_start, logPE_stride, PE_size
          INTEGER pSync(SHMEM_COLLECT_SYNC_SIZE)

          CALL SHMEM_COLLECT4(target, source, nlong, PE_start,
               logPE_stride, PE_size, pSync)

          CALL SHMEM_COLLECT8(target, source, nlong, PE_start,
               logPE_stride, PE_size, pSync)

          CALL SHMEM_FCOLLECT4(target, source, nlong, PE_start,
               logPE_stride, PE_size, pSync)

          CALL SHMEM_FCOLLECT8(target, source, nlong, PE_start,
               logPE_stride, PE_size, pSync)

DESCRIPTION
     The shared memory (SHMEM) collective routines concatenate nlong 64- or
     32-bit data items from the source array into the target array, over the
     set of PEs defined by PE_start, logPE_stride, and PE_size, in processor
     number order.
     The resultant target array contains the contribution from PE PE_start
     first, then the contribution from PE PE_start + PE_stride second, and
     so on, where PE_stride is 2**logPE_stride. The collected result is
     written to the target array for all PEs in the active set.

     The resulting target array is as follows:

          ------------------------------------------------------------------
          source(1..nlong) from PE (PE_start + 0 * (2**logPE_stride))
          ------------------------------------------------------------------
          source(1..nlong) from PE (PE_start + 1 * (2**logPE_stride))
          ------------------------------------------------------------------
          ...
          ------------------------------------------------------------------
          source(1..nlong) from PE (PE_start + (PE_size - 1) *
                                    (2**logPE_stride))
          ------------------------------------------------------------------

     As with all SHMEM collective routines, each of these routines assumes
     that only PEs in the active set call the routine. If a PE not in the
     active set calls a SHMEM collective routine, undefined behavior
     results.
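
     The active-set numbering and target layout described above can be
     sketched in plain C. This is an illustration only: the helper names
     (active_set_pe, active_set_index, target_offset, target_elems) are
     hypothetical and are not part of the SHMEM library, and the sketch
     assumes every PE contributes the same nlong elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not SHMEM routines. */

/* Virtual PE number of the i-th member (0-based) of the active set. */
static int active_set_pe(int PE_start, int logPE_stride, int i)
{
    assert(logPE_stride >= 0 && i >= 0);
    return PE_start + i * (1 << logPE_stride);
}

/* 0-based position of `pe` in the active set, or -1 if `pe` is not a
 * member and therefore must not call the collective routine. */
static int active_set_index(int PE_start, int logPE_stride, int PE_size,
                            int pe)
{
    int stride = 1 << logPE_stride;
    int delta = pe - PE_start;

    if (delta < 0 || delta % stride != 0)
        return -1;
    return (delta / stride < PE_size) ? delta / stride : -1;
}

/* 0-based element offset in target at which the block contributed by
 * active-set member i begins (assumes equal nlong on all PEs). */
static size_t target_offset(int i, size_t nlong)
{
    return (size_t)i * nlong;
}

/* Minimum number of elements the target array must hold. */
static size_t target_elems(int PE_size, size_t nlong)
{
    return (size_t)PE_size * nlong;
}
```

     For example, with PE_start = 2, logPE_stride = 1, and PE_size = 4, the
     active set is PEs 2, 4, 6, and 8. With nlong = 64, the block from PE 6
     starts at element offset 128, and target must hold at least 256
     elements.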

     The arguments are as follows:

     target    A symmetric array. The target argument must be large enough
               to accept the concatenation of the source arrays on all PEs.
               The data types are as follows:

               * For shmem_collect8, shmem_collect64, shmem_fcollect8, and
                 shmem_fcollect64, any data type with an element size of 64
                 bits. Fortran derived types, Fortran character type, and
                 C/C++ structures are not permitted.

               * For shmem_collect4, shmem_collect32, shmem_fcollect4, and
                 shmem_fcollect32, any data type with an element size of 32
                 bits. Fortran derived types, Fortran character type, and
                 C/C++ structures are not permitted.

     source    A symmetric data object that can be of any type permissible
               for the target argument.

     nlong     The number of elements in the source array. nlong must be of
               type integer. If you are using Fortran, it must be a default
               integer value. The nlong argument must be equal on all PEs
               for shmem_collect8, shmem_collect64, shmem_fcollect8,
               shmem_fcollect32, and shmem_fcollect64. The nlong argument
               can be different across PEs for shmem_collect32.

     PE_start  The lowest virtual PE number of the active set of PEs.
               PE_start must be of type integer. If you are using Fortran,
               it must be a default integer value.

     logPE_stride
               The log (base 2) of the stride between consecutive virtual PE
               numbers in the active set. logPE_stride must be of type
               integer. If you are using Fortran, it must be a default
               integer value.

     PE_size   The number of PEs in the active set. PE_size must be of type
               integer. If you are using Fortran, it must be a default
               integer value.

     pSync     A symmetric work array. In C/C++, pSync must be of type long
               and size _SHMEM_COLLECT_SYNC_SIZE. In Fortran, pSync must be
               a default integer array of size SHMEM_COLLECT_SYNC_SIZE.
               Every element of this array must be initialized with the
               value _SHMEM_SYNC_VALUE in C/C++ or SHMEM_SYNC_VALUE in
               Fortran before any of the PEs in the active set enter the
               collective routine.

     The values of arguments PE_start, logPE_stride, and PE_size must be
     equal on all PEs in the active set. The same target and source arrays
     and the same pSync work array must be passed to all PEs in the active
     set.

     Upon return from a collective routine, the following are true for the
     local PE:

     * The target array is updated.

     * The data cache region that is mapped to the target data object is
       coherent in all receiving PEs.

     * The values in the pSync array are restored to the original values.

NOTES
     The terms collective and symmetric are defined in intro_shmem(3).

     All SHMEM collective routines reset the values in pSync before they
     return, so a particular pSync buffer need only be initialized the first
     time it is used.

     You must ensure that the pSync array is not being updated on any PE in
     the active set while any of the PEs participate in processing of a
     SHMEM collective routine. Be careful to avoid these situations:

     * If the pSync array is initialized at run time, some type of
       synchronization is needed to ensure that all PEs in the active set
       have initialized pSync before any of them enter a SHMEM routine
       called with the pSync synchronization array.

     * A pSync array can be reused on a subsequent SHMEM collective routine
       only if none of the PEs in the active set are still processing a
       prior SHMEM collective routine call that used the same pSync array.
       In general, this can be ensured only by doing some type of
       synchronization. However, in the special case of SHMEM routines
       being called with the same active set, you can allocate two pSync
       arrays and alternate between them on successive calls.

     The collective routines operate on active PE sets that have a
     non-power-of-two PE_size with some performance degradation. They
     operate with no performance degradation when nlong is a
     non-power-of-two value.

EXAMPLES
     C/C++:

          for (i = 0; i < _SHMEM_COLLECT_SYNC_SIZE; i++) {
              pSync[i] = _SHMEM_SYNC_VALUE;
          }
          shmem_barrier_all(); /* Wait for all PEs to initialize pSync */
          shmem_collect32(target, source, 64, pe_start, logPE_stride,
                          pe_size, pSync);

     Fortran:

          INTEGER PSYNC(SHMEM_COLLECT_SYNC_SIZE)
          DATA PSYNC /SHMEM_COLLECT_SYNC_SIZE*SHMEM_SYNC_VALUE/

          CALL SHMEM_COLLECT4(TARGET, SOURCE, 64, PE_START, LOGPE_STRIDE,
         &   PE_SIZE, PSYNC)

SEE ALSO
     intro_shmem(3)